Model-Based Planning with Discrete and Continuous Actions
Authors
Abstract
Action planning using learned and differentiable forward models of the world is a general approach that has a number of desirable properties, including improved sample complexity over model-free RL methods, reuse of learned models across different tasks, and the ability to perform efficient gradient-based optimization in continuous action spaces. However, this approach does not apply straightforwardly when the action space is discrete. In this work, we show that it is in fact possible to effectively perform planning via backprop in discrete action spaces, using a simple parameterization of the action vectors on the simplex combined with input noise when training the forward model. Our experiments show that this approach can match or outperform model-free RL and discrete planning methods on gridworld navigation tasks in terms of performance and/or planning time while using limited environment interactions, and can additionally be used to perform model-based control in a challenging new task where the action space combines discrete and continuous actions. We furthermore propose a policy distillation approach that yields a fast policy network for use at inference time, removing the need for an iterative planning procedure.
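The idea of planning by backpropagation over actions constrained to the simplex can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and is not taken from the paper: the forward model is a hand-written linear displacement model (`MOVES`) rather than a learned network, the input-noise training step is omitted, the gradient is derived by hand instead of via automatic differentiation, and the names `plan`, `forward_model`, and `MOVES` are hypothetical. The goal is chosen so that the relaxed optimum happens to be one-hot.

```python
import numpy as np

# Hypothetical toy forward model (an assumption, not the paper's learned model):
# each column is the displacement caused by one discrete action on a 2-D grid.
# Column order: up, down, left, right.
MOVES = np.array([[0.0, 0.0, -1.0, 1.0],
                  [1.0, -1.0, 0.0, 0.0]])


def softmax(z):
    """Map each row of logits to a point on the probability simplex."""
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)


def forward_model(state, action):
    """Differentiable transition: accepts one-hot or relaxed action vectors."""
    return state + MOVES @ action


def plan(start, goal, horizon, steps=200, lr=0.5):
    """Gradient descent on per-step action logits through the forward model."""
    logits = np.zeros((horizon, MOVES.shape[1]))
    for _ in range(steps):
        probs = softmax(logits)  # relaxed actions, one simplex point per step
        state = start
        for action in probs:
            state = forward_model(state, action)
        # Loss is the squared distance between the final state and the goal.
        dloss_dstate = 2.0 * (state - goal)
        # The toy model is additive, so d(final state)/d(action_t) = MOVES.
        v = MOVES.T @ dloss_dstate
        # Softmax Jacobian (diag(p) - p p^T) applied to v, for every time step.
        grad = probs * (v - (probs @ v)[:, None])
        logits -= lr * grad
    # Discretize the relaxed plan by taking the most likely action per step.
    return logits.argmax(axis=1)


if __name__ == "__main__":
    actions = plan(np.array([0.0, 0.0]), np.array([3.0, 0.0]), horizon=3)
    print(actions)  # indices into (up, down, left, right)
```

In general a relaxed plan need not end at a vertex of the simplex, so the final `argmax` can lose information; this sketch does not address that case.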
Related resources
The Importance of Being Discrete: Learning Actions through Interaction
A robotic agent experiences a world of continuous multivariate sensations and chooses its actions from a continuous action space. Therefore, hand-coding knowledge sufficient for successful planning in uncertain, dynamic environments is a difficult task. We present a method whereby an unsupervised robotic agent learns to discriminate discrete actions out of its continuous action parameters. These ac...
Generative Planning for Hybrid Systems Based on Flow Tubes
When controlling an autonomous system, it is inefficient or sometimes impossible for the human operator to specify detailed commands. Instead, the field of AI autonomy has developed goal-directed systems, in which human operators specify a series of goals to be accomplished. Increasingly, the control of autonomous systems involves performing a mix of discrete and continuous actions. For example...
Mixed Propositional Metric Temporal Logic: A New Formalism for Temporal Planning
Temporal logics have been used in autonomous planning to represent and reason about temporal planning problems. However, such techniques have typically been restricted to either (1) representing actions, events, and goals with temporal properties or (2) planning for temporally-extended goals under restrictive conditions of classical planning. We introduce Mixed Propositional Metric Temporal Log...
A Model for Mixed Continuous and Discrete Responses with Possibility of Missing Responses
A model for missing data in mixed binary and continuous responses, which can be used on cross-sectional data, is presented. In this model, the response indicator for the binary response can depend on the continuous response. A closed form for the likelihood is found. For data with a complicated pattern of missing responses, some new residuals are also proposed. The model of multiplicative heter...
Planning and Control of Marine Floats in the Presence of Dynamic, Uncertain Currents
We address the control of a vertically profiling float using ocean-model-based predictions of future currents. While these problems are in reality continuous control problems, we solve them by searching a discrete space of future actions. Additionally, while the environment is a continuous space, the ocean model we use is a discrete cell-based model. We show that even with an imperfect model of...